Box S1 | Data collection and analysis methods
Data collection 

The data presented in this article are publicly available from the FDA website. We collected data on new drugs
(new molecular entities, NMEs, and new biological entities, NBEs, for therapeutic use) approved by the FDA's Center
for Drug Evaluation and Research (CDER) and the FDA's Center for Biologics Evaluation and Research (CBER) from 1
January 2003 until 31 December 2013 and listed in the annual reports available on the FDA's website. We excluded
non-therapeutic agents such as antidotes, diagnostic and imaging agents, adjuvants and cosmetic products. We added one
new biologic that was not included in the FDA list but was referred to as a new treatment for prostate cancer in another
FDA document: sipuleucel-T (Provenge), approved in 2010, a new autologous cellular immunotherapy
designed to stimulate a patient's own immune system against cancer and indicated for the treatment of asymptomatic or
minimally symptomatic metastatic castrate-resistant (hormone-refractory) prostate cancer.

We considered the following information: year of approval, therapeutic area of the drug (cancer, cardiovascular, 
dermatology, gastro-intestinal, metabolism and endocrine system, genitourinary, hematology and immunology, 
infections and parasitic diseases, musculoskeletal, nervous system, ophthalmology, respiratory), the assignment to 
special-designation programs (orphan designation, Fast Track, Priority Review and Accelerated Approval), whether
the drug was a biologic (large molecule drug, independently of the approval pathway), whether the drug was approved
for multiple indications, whether the drug was intended for a genetic disease, whether the drug was approved at the first
review cycle, the number of pivotal trials considered for efficacy claim, the total number of patients recruited in the 
pivotal trials and the time from submission to approval. 

We searched the FDA website from February 2012 until January 2014 examining the earliest version of the review 
documents. We classified the therapeutic areas using the Anatomical Therapeutic Chemical (ATC) classification 
system and the Defined Daily Dose (DDD) Index 2014 and the International Statistical Classification of Diseases and 
Related Health Problems 10th Revision, resolving conflicts between the two classifications with our own judgment.
Without a clear indication of the pivotal trials, we considered those presented in the efficacy section of the label. We 
primarily considered the number of patients enrolled in the pivotal trials or — if this information was missing — the 
number of patients treated or evaluated for efficacy. We only considered the indication(s) relative to the first approval. 
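
The patient-count rule above (use the number enrolled, otherwise fall back to the number treated or evaluated for efficacy) can be sketched in a few lines of Python. This is only an illustration of the rule as stated: the class, field and function names are hypothetical, and the numbers are made up rather than taken from any FDA review document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PivotalTrial:
    """One pivotal trial; the field names are hypothetical."""
    enrolled: Optional[int] = None              # patients enrolled, if reported
    treated_or_evaluated: Optional[int] = None  # fallback count, if reported


def patients_in_trial(trial: PivotalTrial) -> int:
    """Prefer the enrolled count; fall back to treated/evaluated if it is missing."""
    if trial.enrolled is not None:
        return trial.enrolled
    if trial.treated_or_evaluated is not None:
        return trial.treated_or_evaluated
    raise ValueError("no patient count available for this trial")


def total_patients(trials: List[PivotalTrial]) -> int:
    """Total number of patients across the pivotal trials of one drug."""
    return sum(patients_in_trial(t) for t in trials)


# Made-up numbers, for illustration only.
trials = [
    PivotalTrial(enrolled=300, treated_or_evaluated=280),  # enrolled count is used
    PivotalTrial(treated_or_evaluated=150),                # fallback count is used
]
print(len(trials), total_patients(trials))  # 2 450
```

Raising an error when neither count is available is a design choice of this sketch; the source does not say how such cases were handled.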

Data analysis 

We analyzed each of the efficiency indicators separately using the statistical methods briefly described below: 

• The proportion of drugs approved at the first review cycle was analyzed with a Bayesian multilevel logistic regression
model, with the probability of approval at the first review cycle assumed to vary across the therapeutic areas. We
considered the following characteristics of the drugs as explanatory variables: assignment to a special-designation
program, approval for multiple indications, biologic drug, and drug to treat a genetic disease. The estimated odds
ratios for the explanatory variables and their 95% credible interval (in Bayesian statistics, the interval within which
the "true" parameter of interest, in this case the odds ratio, lies with 95% probability) are shown in FIG. 1a.

• The number of pivotal trials per drug was classified into four categories (1, 2, 3, or 4 or more pivotal trials) and
analyzed with a Bayesian ordered categorical logistic regression model. We considered the same characteristics
as in the first model as explanatory variables and additionally the proportion of drugs approved at the first review
cycle. The estimated odds ratios for the explanatory variables and their 95% credible interval are shown in FIG. 1b.

• The logarithm of the average number of patients per pivotal trial was analyzed with a Bayesian multilevel linear 
regression model, with the mean number of patients per pivotal trial assumed to vary across the therapeutic areas. 
We considered the following characteristics of the drugs as explanatory variables: orphan drug designation, approval 
for multiple indications, biologic drug, drug to treat a genetic disease and the proportion of drugs approved at the 
first review cycle. The estimated ratios for the explanatory variables and their 95% credible interval are shown in 
FIG. 1c.

• The logarithm of time from submission to approval was analyzed with a Bayesian multilevel linear regression model,
with the mean time from submission to approval assumed to vary across the therapeutic areas. We considered the 
following characteristics of the drugs as explanatory variables: assignment to Priority Review, approval for multiple 
indications, biologic drug, drug to treat a genetic disease and the proportion of drugs approved at the first review 
cycle. The estimated ratios for the explanatory variables and their 95% credible interval are shown in FIG. 1d.
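
For readers who prefer formulas, the four models listed above can be written in one conventional form as follows. The notation is an assumption introduced here, not taken from the source: i indexes drugs, j[i] is the therapeutic area of drug i, x_i is the vector of explanatory variables for that model, and z_i stands for either the average number of patients per pivotal trial or the time from submission to approval. The proportional-odds parameterization of the ordered model is likewise an assumption, and the prior distributions are omitted because the source does not state them.

```latex
\begin{align*}
% Approval at the first review cycle: multilevel logistic regression
y_i &\sim \mathrm{Bernoulli}(p_i), &
\operatorname{logit}(p_i) &= \alpha_{j[i]} + \mathbf{x}_i^{\top}\boldsymbol{\beta}, &
\alpha_j &\sim \mathcal{N}(\mu_\alpha, \sigma_\alpha^2) \\
% Number of pivotal trials T_i in ordered categories 1, 2, 3, 4+
\operatorname{logit}\Pr(T_i \le c) &= \kappa_c - \mathbf{x}_i^{\top}\boldsymbol{\beta}, &
c &\in \{1, 2, 3\}, &
\kappa_1 &< \kappa_2 < \kappa_3 \\
% Log-scale outcomes: patients per pivotal trial, time to approval
\log z_i &= \alpha_{j[i]} + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i, &
\varepsilon_i &\sim \mathcal{N}(0, \sigma^2), &
\alpha_j &\sim \mathcal{N}(\mu_\alpha, \sigma_\alpha^2)
\end{align*}
```

Under this formulation, exp(β_k) is the odds ratio for explanatory variable k in the two logistic models, and the multiplicative ratio on the original scale in the two log-linear models, which is why the first two panels report odds ratios and the last two report ratios.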

The drug characteristics considered as explanatory variables in the models above are summarized
graphically in Supplementary information S2 (figure) and Supplementary information
S3 (figure). The efficiency indicators are summarized graphically in Supplementary
information S4 (figure). All analyses were performed with R version 3.1.0 and WinBUGS version
I.4.P. 